Information-Theoretically Optimal Histogram Density Estimation
Abstract
We regard histogram density estimation as a model selection problem. Our approach is based on the information-theoretic minimum description length (MDL) principle. MDL-based model selection is formalized via the normalized maximum likelihood (NML) distribution, which has several desirable optimality properties. We show how this approach can be applied to learning generic, irregular (variable-width bin) histograms, and how to compute the model selection criterion efficiently. We also derive a dynamic programming algorithm that finds both the NML-optimal bin count and the cut point locations in polynomial time. Finally, we demonstrate our approach via simulation tests.
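The abstract gives no formulas, so the following is only a minimal sketch of the idea, not the authors' algorithm. It assumes the usual form of the NML code length for a K-bin histogram with bin counts h_k and bin widths L_k on n points, namely -Σ h_k log(h_k / (n L_k)) + log C(K, n), where C(K, n) is the multinomial NML normalizer. The normalizer is computed with the known recurrence C(K+2, n) = C(K+1, n) + (n/K) C(K, n). The function names, the `cuts` candidate grid, and the plain cubic-time dynamic program are all illustrative assumptions.

```python
import bisect
import math


def multinomial_complexity(k_max, n):
    """Return a list C with C[K] = multinomial NML normalizer for K bins, n >= 1 points.

    Uses the recurrence C[K+2] = C[K+1] + (n / K) * C[K]. Index 0 is unused.
    The direct sum for C[2] is meant for small n only (it overflows for n in the thousands).
    """
    C = [0.0] * (max(k_max, 1) + 1)
    C[1] = 1.0
    if k_max >= 2:
        C[2] = sum(
            math.comb(n, h) * (h / n) ** h * ((n - h) / n) ** (n - h)
            for h in range(n + 1)
        )
    for K in range(1, k_max - 1):
        C[K + 2] = C[K + 1] + (n / K) * C[K]
    return C


def nml_histogram(data, cuts, k_max):
    """Pick the bin count and cut points minimizing the assumed NML code length.

    `cuts` is a sorted list of candidate cut points (hypothetical input) with
    cuts[0] <= min(data) and cuts[-1] >= max(data); both end points are always used.
    Returns (K, chosen_cut_points, code_length_in_nats).
    """
    n = len(data)
    E = len(cuts) - 1
    xs = sorted(data)
    # cum[j] = number of points in [cuts[0], cuts[j]]
    cum = [bisect.bisect_right(xs, c) for c in cuts]
    cum[0] = 0  # points on the left boundary belong to the first bin

    def bin_cost(i, j):
        # Negative maximized log-likelihood of the bin (cuts[i], cuts[j]].
        h = cum[j] - cum[i]
        if h == 0:
            return 0.0  # convention: 0 * log 0 = 0
        return -h * math.log(h / (n * (cuts[j] - cuts[i])))

    k_max = min(k_max, E)
    C = multinomial_complexity(k_max, n)

    # best[k][j] = cheapest way to cover [cuts[0], cuts[j]] with exactly k bins
    best = [[math.inf] * (E + 1) for _ in range(k_max + 1)]
    back = [[0] * (E + 1) for _ in range(k_max + 1)]
    best[0][0] = 0.0
    for k in range(1, k_max + 1):
        for j in range(k, E + 1):
            for i in range(k - 1, j):
                cost = best[k - 1][i] + bin_cost(i, j)
                if cost < best[k][j]:
                    best[k][j], back[k][j] = cost, i

    # Add the complexity penalty, which depends only on K and n.
    scores = {k: best[k][E] + math.log(C[k]) for k in range(1, k_max + 1)}
    K = min(scores, key=scores.get)

    # Walk the back-pointers to recover the chosen cut points.
    chosen, j = [cuts[E]], E
    for k in range(K, 0, -1):
        j = back[k][j]
        chosen.append(cuts[j])
    return K, chosen[::-1], scores[K]
```

For instance, `nml_histogram([0.04 * (i + 1) for i in range(10)], [0.0, 0.5, 1.0], 2)` compares a single bin on [0, 1] against two bins split at 0.5. Because the penalty term is constant for a fixed K, the dynamic program only needs to minimize the likelihood term and can add log C(K, n) at the end.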
Similar resources
Optimally Learning Populations of Parameters
Consider the following fundamental estimation problem: there are n entities, each with an unknown parameter p_i ∈ [0, 1], and we observe n independent random variables, X_1, ..., X_n, with X_i ∼ Binomial(t, p_i). How accurately can one recover the “histogram” (i.e. cumulative density function) of the p_i's? While the empirical estimates would recover the histogram to earth mover distance Θ(1/√t) (e...
MDL Histogram Density Estimation
We regard histogram density estimation as a model selection problem. Our approach is based on the information-theoretic minimum description length (MDL) principle, which can be applied to tasks such as data clustering, density estimation, image denoising and model selection in general. MDL-based model selection is formalized via the normalized maximum likelihood (NML) distribution, which has se...
Improving accuracy and efficiency of mutual information for multi-modal retinal image registration using adaptive probability density estimation
Mutual information (MI) is a popular similarity measure for performing image registration between different modalities. MI makes a statistical comparison between two images by computing the entropy from the probability distribution of the data. Therefore, to obtain an accurate registration it is important to have an accurate estimation of the true underlying probability distribution. Within the...
Learning Populations of Parameters
Consider the following estimation problem: there are n entities, each with an unknown parameter p_i ∈ [0, 1], and we observe n independent random variables, X_1, ..., X_n, with X_i ∼ Binomial(t, p_i). How accurately can one recover the “histogram” (i.e. cumulative density function) of the p_i's? While the empirical estimates would recover the histogram to earth mover distance Θ(1/√t) (equivalen...